165 research outputs found

    Bayesian mixture labeling and clustering

    Label switching is one of the fundamental issues in Bayesian mixture modeling. It occurs because the components are nonidentifiable under symmetric priors. Without resolving label switching, the ergodic averages of component-specific quantities are identical across components and thus useless for inference about individual components, such as posterior means, predictive component densities, and marginal classification probabilities. In this article, we establish the equivalence between labeling and clustering and propose two simple clustering criteria to resolve label switching. The first method can be considered an extension of K-means clustering. The second finds the labels by minimizing the volume of the labeled samples and is invariant to scale transformations of the parameters. Using a simulation example and applications to two real data sets, we demonstrate the success of the new methods in dealing with the label switching problem.
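    The K-means-style idea can be illustrated with a minimal sketch (our own illustration, not the authors' algorithm): each MCMC draw of the component-specific parameters is relabeled by the permutation that brings it closest to a running reference center, which undoes the label switching. The `relabel` function and the simulated draws below are assumptions for illustration.

```python
# Minimal illustration (not the paper's algorithm): undo label switching in
# MCMC output by relabeling each draw with the permutation of component
# labels that is closest to a running reference -- a K-means-style criterion.
from itertools import permutations
import numpy as np

def relabel(draws):
    """Relabel MCMC draws; draws has shape (n_draws, K)."""
    n, K = draws.shape
    out = np.empty_like(draws)
    out[0] = draws[0]
    ref = draws[0].astype(float)          # reference center, updated online
    for t in range(1, n):
        # pick the permutation minimizing squared distance to the reference
        best = min(permutations(range(K)),
                   key=lambda p: np.sum((draws[t, list(p)] - ref) ** 2))
        out[t] = draws[t, list(best)]
        ref = out[: t + 1].mean(axis=0)   # running mean of relabeled draws
    return out

rng = np.random.default_rng(0)
samples = np.array([-2.0, 0.0, 2.0]) + 0.1 * rng.standard_normal((500, 3))
for row in samples:                       # scramble labels to mimic switching
    rng.shuffle(row)
fixed = relabel(samples)
print(np.sort(fixed.mean(axis=0)))        # component means near -2, 0, 2
```

    After relabeling, each column of `fixed` tracks a single component, so ergodic averages of component-specific quantities become meaningful again.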

    Mixture of Regression Models with Single-Index

    In this article, we propose a class of semiparametric mixture regression models with a single index. We argue that many recently proposed semiparametric/nonparametric mixture regression models can be considered special cases of the proposed model. Unlike existing semiparametric mixture regression models, however, the new proposed model can easily incorporate multivariate predictors into the nonparametric components. Backfitting estimates and the corresponding algorithms are proposed to achieve the optimal convergence rate for both the parameters and the nonparametric functions. We show that the nonparametric functions can be estimated with the same asymptotic accuracy as if the parameters were known, and that the index parameters can be estimated at the traditional parametric root-n convergence rate. Simulation studies and an application to NBA data demonstrate the finite-sample performance of the proposed models. Comment: 28 pages, 2 figures
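    As a concrete illustration of a fully parametric special case, a mixture of linear regressions can be fit by a standard EM algorithm. This is a generic sketch, not the paper's backfitting procedure; `em_mix_reg`, the sign-of-y initialisation, and the simulated data are our own assumptions.

```python
# EM for a two-component mixture of linear regressions -- a fully parametric
# special case of the semiparametric single-index mixture described above.
import numpy as np

def em_mix_reg(X, y, beta, n_iter=100):
    """X: (n, d) design matrix; beta: (K, d) initial coefficients (mutated)."""
    K = beta.shape[0]
    sigma = np.ones(K)
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibilities r[k, i] proportional to pi_k * N(y_i | x_i'beta_k, sigma_k^2)
        dens = np.stack([pi[k] / sigma[k]
                         * np.exp(-0.5 * ((y - X @ beta[k]) / sigma[k]) ** 2)
                         for k in range(K)])
        r = dens / dens.sum(axis=0)
        # M-step: weighted least squares and variance update per component
        for k in range(K):
            w = r[k]
            beta[k] = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
            sigma[k] = np.sqrt(np.sum(w * (y - X @ beta[k]) ** 2) / w.sum())
            pi[k] = w.mean()
    return beta, sigma, pi

rng = np.random.default_rng(1)
n = 300
x = rng.uniform(0.0, 2.0, n) * 0.5        # predictors in (0, 1)
z = rng.random(n) < 0.5                   # latent component labels
y = np.where(z, 1 + 2 * x, -1 - 2 * x) + 0.2 * rng.standard_normal(n)
X = np.column_stack([np.ones(n), x])
# crude initialisation: split by the sign of y (components are well separated)
beta0 = np.array([np.linalg.lstsq(X[y > 0], y[y > 0], rcond=None)[0],
                  np.linalg.lstsq(X[y <= 0], y[y <= 0], rcond=None)[0]])
beta, sigma, pi = em_mix_reg(X, y, beta0)
print(np.round(beta, 2))                  # rows near (1, 2) and (-1, -2)
```

    The single-index extension would replace each linear predictor with a nonparametric function of a one-dimensional projection of the covariates, estimated by backfitting.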

    Fully Bayesian Logistic Regression with Hyper-Lasso Priors for High-dimensional Feature Selection

    High-dimensional feature selection arises in many areas of modern science. For example, in genomic research we want to find the genes that can be used to separate tissues of different classes (e.g. cancer and normal) from the tens of thousands of genes that are active (expressed) in certain tissue cells. To this end, we wish to fit regression and classification models with a large number of features (also called variables or predictors). In the past decade, penalized likelihood methods for fitting regression models based on hyper-LASSO penalization have received increasing attention in the literature. However, fully Bayesian methods that use Markov chain Monte Carlo (MCMC) remain underdeveloped. In this paper we introduce a fully Bayesian MCMC method for learning the severely multimodal posteriors of logistic regression models under hyper-LASSO priors (non-convex penalties). Our MCMC algorithm uses Hamiltonian Monte Carlo in a restricted Gibbs sampling framework; we call the method Bayesian logistic regression with hyper-LASSO (BLRHL) priors. We use simulation studies and real data analysis to demonstrate the superior performance of hyper-LASSO priors and to investigate the choice of the heaviness and scale of hyper-LASSO priors. Comment: 33 pages. arXiv admin note: substantial text overlap with arXiv:1308.469
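    To give a flavour of the setup, here is a toy sketch: a random-walk Metropolis sampler (standing in for the paper's HMC-within-Gibbs scheme) targeting a logistic regression posterior with a generic heavy-tailed Student-t prior (standing in for the hyper-LASSO family). All function names, settings, and data are illustrative assumptions.

```python
# Toy illustration, NOT the paper's BLRHL sampler: fully Bayesian logistic
# regression with a heavy-tailed Student-t ("hyper-LASSO style") prior,
# sampled with plain random-walk Metropolis instead of HMC-within-Gibbs.
import numpy as np

def log_post(beta, X, y, nu=1.0, s=1.0):
    """Logistic log-likelihood plus Student-t log prior on each coefficient."""
    eta = X @ beta
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))   # stable log(1 + e^eta)
    logprior = -0.5 * (nu + 1.0) * np.sum(np.log1p(beta ** 2 / (nu * s ** 2)))
    return loglik + logprior

def metropolis(X, y, n_iter=4000, step=0.2, seed=0):
    rng = np.random.default_rng(seed)
    beta = np.zeros(X.shape[1])
    lp = log_post(beta, X, y)
    draws = np.empty((n_iter, beta.size))
    for t in range(n_iter):
        prop = beta + step * rng.standard_normal(beta.size)
        lp_prop = log_post(prop, X, y)
        if np.log(rng.random()) < lp_prop - lp:          # accept/reject
            beta, lp = prop, lp_prop
        draws[t] = beta
    return draws

rng = np.random.default_rng(2)
n = 200
X = rng.standard_normal((n, 2))
p = 1.0 / (1.0 + np.exp(-2.0 * X[:, 0]))   # only the first feature matters
y = (rng.random(n) < p).astype(float)
draws = metropolis(X, y)
post_mean = draws[1000:].mean(axis=0)       # discard burn-in
```

    The heavy tails of the prior keep large signals nearly unshrunk while the sharp peak at zero suppresses noise coefficients; the resulting posterior is multimodal in higher dimensions, which is what motivates the paper's more sophisticated sampler.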

    Nonparametric and Varying Coefficient Modal Regression

    In this article, we propose a new nonparametric data analysis tool, which we call nonparametric modal regression, to investigate the relationships among variables of interest by estimating the mode of the conditional density of a response variable Y given predictors X. Nonparametric modal regression differs from conventional nonparametric regression in that, instead of the conditional mean or median, it uses the "most likely" conditional values to measure the center. Better prediction performance and robustness are two important advantages of nonparametric modal regression over traditional nonparametric mean and median regression. We propose to estimate the nonparametric modal regression via local polynomial regression, and we investigate the asymptotic properties of the resulting estimator. To broaden its applicability to high-dimensional data and functional/longitudinal data, we further develop a nonparametric varying coefficient modal regression. A Monte Carlo simulation study and an analysis of health care expenditure data demonstrate the superior prediction performance of the proposed nonparametric modal regression relative to traditional nonparametric mean and median regression. Comment: 33 pages
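    A crude version of the idea (a grid-search kernel estimator, not the local polynomial method proposed in the paper) can be sketched as follows; the bandwidths and the contaminated simulated data are illustrative assumptions:

```python
# Minimal sketch of modal regression: estimate m(x0) = argmax_y f(y | x0)
# by maximising a kernel estimate of the conditional density over a y-grid.
# (The paper uses local polynomial regression; this is a generic stand-in.)
import numpy as np

def modal_regression(x, y, x0, hx=0.2, hy=0.15):
    grid = np.linspace(y.min(), y.max(), 400)
    wx = np.exp(-0.5 * ((x - x0) / hx) ** 2)            # Gaussian weights in x
    dens = (wx[:, None]
            * np.exp(-0.5 * ((y[:, None] - grid[None, :]) / hy) ** 2)).sum(axis=0)
    return grid[np.argmax(dens)]                        # "most likely" y at x0

rng = np.random.default_rng(3)
n = 500
x = rng.uniform(0.0, 2.0, n)
outlier = rng.random(n) < 0.2                           # 20% shifted contamination
y = x + 0.1 * rng.standard_normal(n) + np.where(outlier, 3.0, 0.0)
m_hat = modal_regression(x, y, x0=1.0)
```

    With 20% of responses shifted upward, the conditional mean at x0 = 1 is pulled to roughly 1.6, while the conditional mode stays near 1, illustrating the robustness claimed for modal regression.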

    Editor's Preface and Table of Contents

    These proceedings contain papers presented at the twenty-third annual Kansas State University Conference on Applied Statistics in Agriculture, held in Manhattan, Kansas, May 1-3, 2011.